Handheld audio-visual (AV) recording devices are often used to generate AV recordings of different types of events, such as birthdays and celebrations, as well as live events featuring artistic or oratory performances. Typically, a handheld AV recording device is equipped with integrated microphones that convert sound waves into low-fidelity audio signals. Generally, low-fidelity audio signals can adequately characterize the audio of personal events, such as birthdays and celebrations. However, a low-fidelity audio signal can lack sufficient sound quality to clearly capture the sound of a live event, thereby detracting from a listener's appreciation of the artistic or oratory performance. Many factors may contribute to low sound quality. For example, integrated microphones are inherently limited and are generally incapable of converting sound waves into high-fidelity audio signals. Further, an audio recording may be affected by background noise from other audience members at the live event or by other ambient noise. Moreover, the handheld AV recording device may itself be positioned too far away from the performance. Therefore, a need exists to allow audience members of a live event to incorporate a high-fidelity audio signal of the live event into a video segment that records their personal experiences via a handheld AV recording device.